11 research outputs found

ReCoil - an algorithm for compression of extremely large datasets of DNA data

Author: Adam L Buchsbaum
Alok Aggarwal
Bin Ma
Christos Kozanitis
Daniel D Sommer
David Eppstein
M Waterman
Markus Fritz Hsi-Yang
P Ferragina
Paolo Ferragina
R Dementiev
Roman Dementiev
Scott Christley
Veli Mäkinen
Vladimir Yanovsky
W Timothy White
Wenyu Zhang
Xin Chen
Z Ning
Publication venue: BioMed Central
Publication date: 01/01/2011

The growing volume of generated DNA sequencing data makes the problem of its long-term storage increasingly important. In this work we present ReCoil, an I/O-efficient external-memory algorithm designed for compression of very large collections of short-read DNA data. Typically, each position of a DNA sequence is covered by multiple reads of a short-read dataset, and our algorithm exploits the resulting redundancy to achieve a high compression rate.

University of Toronto Research Repository

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

CGGBP1 mitigates cytosine methylation at repetitive DNA sequences

Author: B Langmead
Bengt Westermark
BH Ramsahoye
D Biniszkiewicz
D Blankenberg
D Cortazar
DC Hancks
DM Messerschmidt
EL Fritz
F Butter
F Fuks
F Krueger
F Naumann
H Deissler
H Gowher
H Muller-Hartmann
Helena Jernberg Wiklund
KD Robertson
KI Tatematsu
LS Chuang
M Fatemi
M Okano
Markus Hsi-Yang Fritz
MR Rountree
P Rice
Paul Collier
Prasoon Agarwal
R Schipper
S Cortellino
S Pradhan
S Tempel
U Singh
Umashankar Singh
Vladimir Benes
W Guo
X Zhang
Y Shimooka
ZD Smith
Publication venue: Springer Science and Business Media LLC

Crossref

Genetic code expansion for multiprotein complex engineering

Author: A Bianco
A Chatterjee
A Goldhirsch
A Maiolica
AC Wolff
Attila Gyenesei
Bence Galik
C Bieniossek
Carsten Schultz
CC Liu
Christine Koehler
DJ Fitzgerald
E Provenzano
E Sisamakis
EA Lemke
Edward A Lemke
G Hernandez Jr.
Gemma Estrada Girona
Giancarlo Pruneri
Hueseyin Besir
I Berger
I Nikić
Imre Berger
J Cox
J Rappsilber
Jan O Korbel
Jan-Erik Hoffmann
Jonathan J M Landry
JT Simpson
Juan Zou
Juri Rappsilber
JW Chin
JY Axup
Kapil Gupta
Ksenija Radic
M Zhang
Markus Hsi-Yang Fritz
Martin Jechlinger
Mirella Wawryszyn
MM Robinson
Moritz Bosse Biskup
MY Polley
Paul F Sauter
Peggy Stolt-Bergner
Piau Siong Tan
PR Chen
R Luo
RJ Tomko Jr.
S Milles
S Tyagi
Sini Junttila
SM Hancock
SM Kraemer
SS Thakur
Stefan Braese
T Crépin
T Magoč
T Mukai
T Plass
Vladimir Benes
ZA Chen
Zhuo A Chen
Publication venue: Springer Science and Business Media LLC
Publication date: 01/12/2016

We present a baculovirus-based protein engineering method that enables site-specific introduction of unique functionalities in a eukaryotic protein complex recombinantly produced in insect cells. We demonstrate the versatility of this efficient and robust protein production platform, ‘MultiBacTATAG’, (i) for the fluorescent labeling of target proteins and biologics using click chemistries, (ii) for glycoengineering of antibodies, and (iii) for structure–function studies of novel eukaryotic complexes using single-molecule Förster resonance energy transfer as well as site-specific crosslinking strategies.

Crossref

AIR Universita degli studi di Milano

Edinburgh Research Explorer

Explore Bristol Research

Efficient storage of high throughput DNA sequencing data using reference-based compression

Author: Birney Ewan
Cochrane Guy
Hsi-Yang Fritz Markus
Leinonen Rasko
Publication venue: Cold Spring Harbor Laboratory Press

Data storage costs have become an appreciable proportion of total cost in the creation and analysis of DNA sequence data. Of particular concern is that the rate of increase in DNA sequencing is significantly outstripping the rate of increase in disk storage capacity. In this paper we present a new reference-based compression method that efficiently compresses DNA sequences for storage. Our approach works for resequencing experiments that target well-studied genomes. We align new sequences to a reference genome and then encode the differences between the new sequence and the reference genome for storage. Our compression method is most efficient when we allow controlled loss of data in the saving of quality information and unaligned sequences. With this new compression method we observe exponential efficiency gains as read lengths increase, and the magnitude of this efficiency gain can be controlled by changing the amount of quality information stored. Our compression method is tunable: the storage of quality scores and unaligned sequences may be adjusted for different experiments to conserve information or to minimize storage costs. It provides one opportunity to address the threat that increasing DNA sequence volumes will overcome our ability to store the sequences.

Crossref

PubMed Central

CGGBP1 mitigates cytosine methylation at repetitive DNA sequences

Author: Agarwal Prasoon
Benes Vladimir
Collier Paul
Fritz Markus Hsi-Yang
Singh Umashankar
Westermark Bengt
Wiklund Helena Jernberg
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2015

Background: CGGBP1 is a repetitive DNA-binding transcription regulator with target sites at CpG-rich sequences such as CGG repeats, Alu-SINEs and L1-LINEs. The role of CGGBP1 as a possible mediator of CpG methylation, however, remains unknown. At CpG-rich sequences, cytosine methylation is a major mechanism of transcriptional repression. Concordantly, gene-rich regions typically carry lower levels of CpG methylation than the repetitive elements. It is well known that at the interspersed repeats Alu-SINEs and L1-LINEs, high levels of CpG methylation constitute a transcriptional silencing and retrotransposon inactivating mechanism. Results: Here, we have studied genome-wide CpG methylation with or without CGGBP1-depletion. By high throughput sequencing of bisulfite-treated genomic DNA we have identified CGGBP1 to be a negative regulator of CpG methylation at repetitive DNA sequences. In addition, we have studied CpG methylation alterations on Alu and L1 retrotransposons in CGGBP1-depleted cells using a novel bisulfite-treatment and high throughput sequencing approach. Conclusions: The results clearly show that CGGBP1 is a possible bidirectional regulator of CpG methylation at Alus, and acts as a repressor of methylation at L1 retrotransposons.

Springer - Publisher Connector

Publikationer från Uppsala Universitet

PubMed Central

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Primate genome architecture influences structural variation mechanisms and functional consequences

Author: Benes Vladimir
Fritz Markus Hsi-Yang
Gokcumen Omer
Iskow Rebecca C
Korbel Jan O
Langdon Amy
Lee Charles
Lee Eunjung
Mills Ryan E
Park Peter J
Pavlidis Pavlos
Stütz Adrian M
Tica Jelena
Tischler Verena
Zhu Qihui
Publication venue: The Mouseion at the JAXlibrary
Publication date: 06/09/2013

Although nucleotide resolution maps of genomic structural variants (SVs) have provided insights into the origin and impact of phenotypic diversity in humans, comparable maps in nonhuman primates have thus far been lacking. Using massively parallel DNA sequencing, we constructed fine-resolution genomic structural variation maps in five chimpanzees, five orang-utans, and five rhesus macaques. The SV maps, which are comprised of thousands of deletions, duplications, and mobile element insertions, revealed a high activity of retrotransposition in macaques compared with great apes. By comparison, nonallelic homologous recombination is specifically active in the great apes, which is correlated with architectural differences between the genomes of great apes and macaque. Transcriptome analyses across nonhuman primates and humans revealed effects of species-specific whole-gene duplication on gene expression. We identified 13 gene duplications coinciding with the species-specific gain of tissue-specific gene expression in keeping with a role of gene duplication in the promotion of diversification and the acquisition of unique functions. Differences in the present day activity of SV formation mechanisms that our study revealed may contribute to ongoing diversification and adaptation of great ape and Old World monkey lineages. Proc Natl Acad Sci U S A 2013 Sep 24; 110(39):15764-15769

The Jackson Laboratory: The Mouseion at the JAXlibrary

PubMed Central

CBFB-MYH11 hypomethylation signature and PBX3 differential methylation revealed by targeted bisulfite sequencing in patients with acute myeloid leukemia

Author: A Akalin
A Mandoli
Arnošt Kostečka
AS Pitiot
C Carella
C Chang
C Langer
C Rohde
Cedrik Haškovec
CY McLean
Cyril Šálek
D Grimwade
D Jiang
D Sproul
DR de la Bletiere
F Krueger
GJ Dickson
H Itonaga
H Shiah
Hana Hájková
J Markova
Jana Marková
Jiří Schwarz
K Mrozek
Kyra Michalová
M Caligiuri
M Heuser
M Ivanov
M Lohse
M Schneider
M Sonnet
Markus Hsi-Yang Fritz
Martin Vostrý
ME Figueroa
Michaela Dostálová Merkerová
O Fuchs
Ota Fuchs
P Valk
PA Jones
Petr Cetkovský
R Suzuki
Vladimír Beneš
Z Li
Zdeněk Krejčík
Publication venue: Springer Science and Business Media LLC

Crossref

An integrated map of structural variation in 2,504 human genomes

Author: Abyzov Alexej
Alkan Can
Antaki Danny
Auton Adam
Bae Taejeong
Bashir Ali
Batzer Mark A.
Casale Francesco Paolo
Cerveira Eliza
Chaisson Mark J.P.
Chen Jieming
Chen Ken
Chines Peter
Chong Zechen
Clarke Laura
Dal Elif
Dayama Gargi
Devine Scott E.
Ding Li
Eichler Evan E.
Emery Sarah
Fan Xian
Flicek Paul
Fritz Markus Hsi-Yang
Gardner Eugene J.
Garrison Erik
Gerstein Mark B.
Gibbs Richard A.
Gujral Madhusudan
Handsaker Robert E.
Hormozdiari Fereydoun
Huddleston John
Jun Goo
Kahveci Fatma
Kashin Seva
Kidd Jeffrey M.
Kong Yu
Konkel Miriam K.
Korbel Jan O.
Lam Hugo Y. K.
Lameijer Eric-Wubbo
Lee Charles
Malhotra Ankit
Malig Maika
Marth Gabor
Mason Christopher E.
McCarroll Steven A.
McCarthy Shane
Meiers Sascha
Menelaou Androniki
Mills Ryan E.
Mu Xinmeng Jasmine
Muzny Donna M.
Nelson Bradley J.
Noor Amina
Parrish Nicholas F.
Pendleton Matthew
Quitadamo Andrew
Raeder Benjamin
Rausch Tobias
Romanovitch Mallory
Schadt Eric E.
Schlattl Andreas
Sebat Jonathan
Sebra Robert
Shabalin Andrey A.
Shi Xinghua
Stegle Oliver
Stütz Adrian M.
Sudmant Peter H.
Untergasser Andreas
Walker Jerilyn A.
Walter Klaudia
Wang Min
Ye Kai
Yu Fuli
Zhang Chengsheng
Zhang Jing
Zhang Yan
Zheng-Bradley Xiangqun
Zhou Wanding
Zichner Thomas
Publication venue: Springer Science and Business Media LLC
Publication date: 01/10/2015

Summary: Structural variants (SVs) are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight SV classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analyzing this set, we identify numerous gene-intersecting SVs exhibiting population stratification and describe naturally occurring homozygous gene knockouts suggesting the dispensability of a variety of human genes. We demonstrate that SVs are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of SV complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex SVs with multiple breakpoints likely formed through individual mutational events. Our catalog will enhance future studies into SV demography, functional impact and disease association.

The Jackson Laboratory: The Mouseion at the JAXlibrary

Harvard University - DASH

PubMed Central

eScholarship - University of California